Overview

Dataset statistics

Number of variables: 16
Number of observations: 20000
Missing cells: 8261
Missing cells (%): 2.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.4 MiB
Average record size in memory: 128.0 B
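
The missing-cell figures can be cross-checked against the per-variable missing counts reported in the Variables section. A minimal sketch in Python, using only numbers from this report (the variable names are illustrative):

```python
# Figures copied from this report; the variable names are illustrative.
n_variables = 16
n_observations = 20000

# Per-variable missing counts as listed in the Variables section.
missing_by_column = {
    "name": 7,
    "host_name": 8,
    "last_review": 4123,
    "reviews_per_month": 4123,
}

missing_cells = sum(missing_by_column.values())
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 8261
print(f"{missing_pct:.1f}%")  # 2.6%
```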

Variable types

NUM: 10
CAT: 6

Warnings

name has a high cardinality: 19768 distinct values [High cardinality]
host_name has a high cardinality: 6517 distinct values [High cardinality]
neighbourhood has a high cardinality: 217 distinct values [High cardinality]
last_review has a high cardinality: 1507 distinct values [High cardinality]
last_review has 4123 (20.6%) missing values [Missing]
reviews_per_month has 4123 (20.6%) missing values [Missing]
minimum_nights is highly skewed (γ1 = 25.17996962) [Skewed]
name is uniformly distributed [Uniform]
id has unique values [Unique]
number_of_reviews has 4123 (20.6%) zeros [Zeros]
availability_365 has 7176 (35.9%) zeros [Zeros]

Reproduction

Analysis started: 2021-10-02 05:39:28.425646
Analysis finished: 2021-10-02 05:40:26.994624
Duration: 58.57 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

id
Real number (ℝ≥0)

UNIQUE

Distinct: 20000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18923800.78
Minimum: 2539
Maximum: 36485609
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 2539
5-th percentile: 1193873.85
Q1: 9393540.5
median: 19521168.5
Q3: 29129358.75
95-th percentile: 35275607.45
Maximum: 36485609
Range: 36483070
Interquartile range (IQR): 19735818.25
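
The range and interquartile range are derived from the other quantile figures. A short sketch, using the values reported for id (variable names are illustrative):

```python
# Quantile figures for `id`, copied from this report.
minimum, q1, q3, maximum = 2539, 9393540.5, 29129358.75, 36485609

value_range = maximum - minimum  # Range = Maximum - Minimum
iqr = q3 - q1                    # IQR = Q3 - Q1

print(value_range)  # 36483070
print(iqr)          # 19735818.25
```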

Descriptive statistics

Standard deviation: 11012232.42
Coefficient of variation (CV): 0.5819249812
Kurtosis: -1.233322955
Mean: 18923800.78
Median Absolute Deviation (MAD): 9896304
Skewness: -0.07538052651
Sum: 3.784760157e+11
Variance: 1.212692628e+14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
32493112 | 1 | < 0.1%
2728293 | 1 | < 0.1%
30741302 | 1 | < 0.1%
5354796 | 1 | < 0.1%
36308266 | 1 | < 0.1%
31194309 | 1 | < 0.1%
15509545 | 1 | < 0.1%
13073701 | 1 | < 0.1%
20775370 | 1 | < 0.1%
169152 | 1 | < 0.1%
Other values (19990) | 19990 | > 99.9%

Value | Count | Frequency (%)
2539 | 1 | < 0.1%
3831 | 1 | < 0.1%
5022 | 1 | < 0.1%
5121 | 1 | < 0.1%
5203 | 1 | < 0.1%

Value | Count | Frequency (%)
36485609 | 1 | < 0.1%
36485057 | 1 | < 0.1%
36480292 | 1 | < 0.1%
36479723 | 1 | < 0.1%
36478343 | 1 | < 0.1%

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 19768
Distinct (%): 98.9%
Missing: 7
Missing (%): < 0.1%
Memory size: 156.2 KiB

Value | Count
Hillside Hotel | 7
Brooklyn Apartment | 7
Private Room | 6
New york Multi-unit building | 6
Home away from home | 6
Other values (19763) | 19961

Value | Count | Frequency (%)
Hillside Hotel | 7 | < 0.1%
Brooklyn Apartment | 7 | < 0.1%
Private Room | 6 | < 0.1%
New york Multi-unit building | 6 | < 0.1%
Home away from home | 6 | < 0.1%
Private room in Manhattan | 5 | < 0.1%
Cozy Room | 5 | < 0.1%
Cozy Private Room | 4 | < 0.1%
Private room in Williamsburg | 4 | < 0.1%
Private room in Brooklyn | 3 | < 0.1%
Other values (19758) | 19940 | 99.7%
(Missing) | 7 | < 0.1%
Frequencies of value counts

Unique

Unique: 19599
Unique (%): 98.0%
Histogram of lengths of the category

Length

Max length: 179
Median length: 36
Mean length: 36.8906
Min length: 1

host_id
Real number (ℝ≥0)

Distinct: 17027
Distinct (%): 85.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 67460344.07
Minimum: 2571
Maximum: 274273284
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 2571
5-th percentile: 779450
Q1: 7853718.25
median: 31114309.5
Q3: 106842560
95-th percentile: 242095840.9
Maximum: 274273284
Range: 274270713
Interquartile range (IQR): 98988841.75

Descriptive statistics

Standard deviation: 78579364.8
Coefficient of variation (CV): 1.164823066
Kurtosis: 0.2087986308
Mean: 67460344.07
Median Absolute Deviation (MAD): 27863859.5
Skewness: 1.219649017
Sum: 1.349206881e+12
Variance: 6.174716572e+15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
219517861 | 131 | 0.7%
107434423 | 89 | 0.4%
30283594 | 53 | 0.3%
12243051 | 37 | 0.2%
16098958 | 36 | 0.2%
137358866 | 36 | 0.2%
61391963 | 34 | 0.2%
22541573 | 32 | 0.2%
200380610 | 27 | 0.1%
2856748 | 24 | 0.1%
Other values (17017) | 19501 | 97.5%

Value | Count | Frequency (%)
2571 | 1 | < 0.1%
2787 | 3 | < 0.1%
3151 | 1 | < 0.1%
3415 | 1 | < 0.1%
3563 | 1 | < 0.1%

Value | Count | Frequency (%)
274273284 | 1 | < 0.1%
274195458 | 1 | < 0.1%
274103383 | 1 | < 0.1%
274079964 | 1 | < 0.1%
273870123 | 1 | < 0.1%

host_name
Categorical

HIGH CARDINALITY

Distinct: 6517
Distinct (%): 32.6%
Missing: 8
Missing (%): < 0.1%
Memory size: 156.2 KiB

Value | Count
David | 170
Michael | 167
John | 133
Sonder (NYC) | 131
Alex | 105
Other values (6512) | 19286

Value | Count | Frequency (%)
David | 170 | 0.9%
Michael | 167 | 0.8%
John | 133 | 0.7%
Sonder (NYC) | 131 | 0.7%
Alex | 105 | 0.5%
Daniel | 95 | 0.5%
Blueground | 89 | 0.4%
Sarah | 87 | 0.4%
Maria | 87 | 0.4%
Chris | 83 | 0.4%
Other values (6507) | 18845 | 94.2%
Frequencies of value counts

Unique

Unique: 4203
Unique (%): 21.0%
Histogram of lengths of the category

Length

Max length: 35
Median length: 6
Mean length: 6.1112
Min length: 1

neighbourhood_group
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
Manhattan | 8774
Brooklyn | 8265
Queens | 2355
Bronx | 441
Staten Island | 165

Value | Count | Frequency (%)
Manhattan | 8774 | 43.9%
Brooklyn | 8265 | 41.3%
Queens | 2355 | 11.8%
Bronx | 441 | 2.2%
Staten Island | 165 | 0.8%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 13
Median length: 8
Mean length: 8.1783
Min length: 5

neighbourhood
Categorical

HIGH CARDINALITY

Distinct: 217
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
Williamsburg | 1580
Bedford-Stuyvesant | 1503
Harlem | 1116
Bushwick | 987
Upper West Side | 798
Other values (212) | 14016

Value | Count | Frequency (%)
Williamsburg | 1580 | 7.9%
Bedford-Stuyvesant | 1503 | 7.5%
Harlem | 1116 | 5.6%
Bushwick | 987 | 4.9%
Upper West Side | 798 | 4.0%
Hell's Kitchen | 771 | 3.9%
East Village | 756 | 3.8%
Upper East Side | 706 | 3.5%
Crown Heights | 648 | 3.2%
Midtown | 643 | 3.2%
Other values (207) | 10492 | 52.5%
Frequencies of value counts

Unique

Unique: 20
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length: 26
Median length: 12
Mean length: 11.8819
Min length: 4

latitude
Real number (ℝ≥0)

Distinct: 12439
Distinct (%): 62.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72845515
Minimum: 40.50873
Maximum: 40.91306
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 40.50873
5-th percentile: 40.6451295
Q1: 40.68942
median: 40.72273
Q3: 40.76299
95-th percentile: 40.8256525
Maximum: 40.91306
Range: 0.40433
Interquartile range (IQR): 0.07357

Descriptive statistics

Standard deviation: 0.05475507699
Coefficient of variation (CV): 0.001344393663
Kurtosis: 0.1085655141
Mean: 40.72845515
Median Absolute Deviation (MAD): 0.03658
Skewness: 0.2301693808
Sum: 814569.103
Variance: 0.002998118457
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
40.68634 | 8 | < 0.1%
40.72232 | 8 | < 0.1%
40.71813 | 8 | < 0.1%
40.69414 | 8 | < 0.1%
40.72607 | 7 | < 0.1%
40.68683 | 7 | < 0.1%
40.70587 | 7 | < 0.1%
40.71801 | 7 | < 0.1%
40.68084 | 7 | < 0.1%
40.71992 | 6 | < 0.1%
Other values (12429) | 19927 | 99.6%

Value | Count | Frequency (%)
40.50873 | 1 | < 0.1%
40.52293 | 1 | < 0.1%
40.53076 | 1 | < 0.1%
40.53871 | 1 | < 0.1%
40.53884 | 1 | < 0.1%

Value | Count | Frequency (%)
40.91306 | 1 | < 0.1%
40.90527 | 1 | < 0.1%
40.90391 | 1 | < 0.1%
40.90356 | 1 | < 0.1%
40.90329 | 1 | < 0.1%

longitude
Real number (ℝ)

Distinct: 10181
Distinct (%): 50.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.95212502
Minimum: -74.23914
Maximum: -73.71795
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: -74.23914
5-th percentile: -74.0041805
Q1: -73.98303
median: -73.95564
Q3: -73.93638
95-th percentile: -73.864896
Maximum: -73.71795
Range: 0.52119
Interquartile range (IQR): 0.04665

Descriptive statistics

Standard deviation: 0.04655878323
Coefficient of variation (CV): -0.000629580059
Kurtosis: 4.938242884
Mean: -73.95212502
Median Absolute Deviation (MAD): 0.024895
Skewness: 1.255100378
Sum: -1479042.5
Variance: 0.002167720296
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-73.98589 | 10 | 0.1%
-73.94829 | 9 | < 0.1%
-73.95742 | 9 | < 0.1%
-73.95427 | 9 | < 0.1%
-73.95121 | 9 | < 0.1%
-73.98043 | 9 | < 0.1%
-73.95725 | 9 | < 0.1%
-73.95332 | 8 | < 0.1%
-73.95675 | 8 | < 0.1%
-73.95509 | 8 | < 0.1%
Other values (10171) | 19912 | 99.6%

Value | Count | Frequency (%)
-74.23914 | 1 | < 0.1%
-74.21238 | 1 | < 0.1%
-74.20295 | 1 | < 0.1%
-74.19826 | 1 | < 0.1%
-74.19626 | 1 | < 0.1%

Value | Count | Frequency (%)
-73.71795 | 1 | < 0.1%
-73.71829 | 1 | < 0.1%
-73.72582 | 1 | < 0.1%
-73.72716 | 1 | < 0.1%
-73.72731 | 1 | < 0.1%

room_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 156.2 KiB

Value | Count
Entire home/apt | 10384
Private room | 9172
Shared room | 444

Value | Count | Frequency (%)
Entire home/apt | 10384 | 51.9%
Private room | 9172 | 45.9%
Shared room | 444 | 2.2%
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 15
Median length: 15
Mean length: 13.5354
Min length: 11

price
Real number (ℝ≥0)

Distinct: 544
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 153.26905
Minimum: 0
Maximum: 10000
Zeros: 5
Zeros (%): < 0.1%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 40
Q1: 69
median: 105
Q3: 175
95-th percentile: 350
Maximum: 10000
Range: 10000
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 243.3256089
Coefficient of variation (CV): 1.587571717
Kurtosis: 538.297578
Mean: 153.26905
Median Absolute Deviation (MAD): 45
Skewness: 18.3046896
Sum: 3065381
Variance: 59207.35193
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 856 | 4.3%
150 | 821 | 4.1%
50 | 636 | 3.2%
200 | 590 | 2.9%
75 | 570 | 2.9%
60 | 555 | 2.8%
80 | 531 | 2.7%
70 | 482 | 2.4%
120 | 471 | 2.4%
65 | 471 | 2.4%
Other values (534) | 14017 | 70.1%

Value | Count | Frequency (%)
0 | 5 | < 0.1%
10 | 6 | < 0.1%
11 | 2 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%

Value | Count | Frequency (%)
10000 | 1 | < 0.1%
9999 | 1 | < 0.1%
8500 | 1 | < 0.1%
7703 | 1 | < 0.1%
7500 | 1 | < 0.1%

minimum_nights
Real number (ℝ≥0)

SKEWED

Distinct: 75
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.9921
Minimum: 1
Maximum: 1250
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 5
95-th percentile: 30
Maximum: 1250
Range: 1249
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 21.64544903
Coefficient of variation (CV): 3.095700724
Kurtosis: 1072.167601
Mean: 6.9921
Median Absolute Deviation (MAD): 1
Skewness: 25.17996962
Sum: 139842
Variance: 468.5254639
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 5248 | 26.2%
2 | 4796 | 24.0%
3 | 3309 | 16.5%
30 | 1540 | 7.7%
4 | 1328 | 6.6%
5 | 1208 | 6.0%
7 | 855 | 4.3%
6 | 307 | 1.5%
14 | 217 | 1.1%
10 | 183 | 0.9%
Other values (65) | 1009 | 5.0%

Value | Count | Frequency (%)
1 | 5248 | 26.2%
2 | 4796 | 24.0%
3 | 3309 | 16.5%
4 | 1328 | 6.6%
5 | 1208 | 6.0%

Value | Count | Frequency (%)
1250 | 1 | < 0.1%
999 | 2 | < 0.1%
480 | 1 | < 0.1%
400 | 1 | < 0.1%
370 | 1 | < 0.1%

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 323
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.2741
Minimum: 0
Maximum: 607
Zeros: 4123
Zeros (%): 20.6%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 1
median: 5
Q3: 23
95-th percentile: 114
Maximum: 607
Range: 607
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 44.92779312
Coefficient of variation (CV): 1.930377248
Kurtosis: 20.22980666
Mean: 23.2741
Median Absolute Deviation (MAD): 5
Skewness: 3.761375507
Sum: 465482
Variance: 2018.506595
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 4123 | 20.6%
1 | 2131 | 10.7%
2 | 1394 | 7.0%
3 | 1033 | 5.2%
4 | 827 | 4.1%
5 | 631 | 3.2%
6 | 560 | 2.8%
7 | 513 | 2.6%
8 | 469 | 2.3%
9 | 400 | 2.0%
Other values (313) | 7919 | 39.6%

Value | Count | Frequency (%)
0 | 4123 | 20.6%
1 | 2131 | 10.7%
2 | 1394 | 7.0%
3 | 1033 | 5.2%
4 | 827 | 4.1%

Value | Count | Frequency (%)
607 | 1 | < 0.1%
594 | 1 | < 0.1%
510 | 1 | < 0.1%
488 | 1 | < 0.1%
474 | 1 | < 0.1%

last_review
Categorical

HIGH CARDINALITY
MISSING

Distinct: 1507
Distinct (%): 9.5%
Missing: 4123
Missing (%): 20.6%
Memory size: 156.2 KiB

Value | Count
2019-06-23 | 575
2019-07-01 | 557
2019-06-30 | 553
2019-06-24 | 350
2019-07-07 | 292
Other values (1502) | 13550

Value | Count | Frequency (%)
2019-06-23 | 575 | 2.9%
2019-07-01 | 557 | 2.8%
2019-06-30 | 553 | 2.8%
2019-06-24 | 350 | 1.8%
2019-07-07 | 292 | 1.5%
2019-07-02 | 280 | 1.4%
2019-06-22 | 270 | 1.4%
2019-07-05 | 252 | 1.3%
2019-06-16 | 245 | 1.2%
2019-07-06 | 240 | 1.2%
Other values (1497) | 12263 | 61.3%
(Missing) | 4123 | 20.6%
Frequencies of value counts

Unique

Unique: 358
Unique (%): 2.3%
Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 8.55695
Min length: 3

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 790
Distinct (%): 5.0%
Missing: 4123
Missing (%): 20.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.377445991
Minimum: 0.01
Maximum: 27.95
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 0.01
5-th percentile: 0.04
Q1: 0.19
median: 0.72
Q3: 2.01
95-th percentile: 4.67
Maximum: 27.95
Range: 27.94
Interquartile range (IQR): 1.82

Descriptive statistics

Standard deviation: 1.683005621
Coefficient of variation (CV): 1.22183057
Kurtosis: 11.95169963
Mean: 1.377445991
Median Absolute Deviation (MAD): 0.62
Skewness: 2.435799259
Sum: 21869.71
Variance: 2.83250792
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 378 | 1.9%
0.02 | 378 | 1.9%
0.05 | 352 | 1.8%
0.03 | 334 | 1.7%
0.04 | 274 | 1.4%
0.08 | 263 | 1.3%
0.16 | 255 | 1.3%
0.09 | 246 | 1.2%
0.06 | 234 | 1.2%
0.11 | 222 | 1.1%
Other values (780) | 12941 | 64.7%
(Missing) | 4123 | 20.6%

Value | Count | Frequency (%)
0.01 | 18 | 0.1%
0.02 | 378 | 1.9%
0.03 | 334 | 1.7%
0.04 | 274 | 1.4%
0.05 | 352 | 1.8%

Value | Count | Frequency (%)
27.95 | 1 | < 0.1%
20.94 | 1 | < 0.1%
19.75 | 1 | < 0.1%
17.82 | 1 | < 0.1%
16.22 | 1 | < 0.1%

calculated_host_listings_count
Real number (ℝ≥0)

Distinct: 47
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.95545
Minimum: 1
Maximum: 327
Zeros: 0
Zeros (%): 0.0%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 2
95-th percentile: 14
Maximum: 327
Range: 326
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 32.43383053
Coefficient of variation (CV): 4.663081545
Kurtosis: 70.36535249
Mean: 6.95545
Median Absolute Deviation (MAD): 0
Skewness: 8.096123898
Sum: 139109
Variance: 1051.953363
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)
Value | Count | Frequency (%)
1 | 13290 | 66.5%
2 | 2688 | 13.4%
3 | 1174 | 5.9%
4 | 570 | 2.9%
5 | 361 | 1.8%
6 | 226 | 1.1%
8 | 168 | 0.8%
7 | 166 | 0.8%
327 | 131 | 0.7%
9 | 103 | 0.5%
Other values (37) | 1123 | 5.6%

Value | Count | Frequency (%)
1 | 13290 | 66.5%
2 | 2688 | 13.4%
3 | 1174 | 5.9%
4 | 570 | 2.9%
5 | 361 | 1.8%

Value | Count | Frequency (%)
327 | 131 | 0.7%
232 | 89 | 0.4%
121 | 53 | 0.3%
103 | 36 | 0.2%
96 | 73 | 0.4%

availability_365
Real number (ℝ≥0)

ZEROS

Distinct: 366
Distinct (%): 1.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 112.9012
Minimum: 0
Maximum: 365
Zeros: 7176
Zeros (%): 35.9%
Memory size: 156.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 44
Q3: 229
95-th percentile: 359
Maximum: 365
Range: 365
Interquartile range (IQR): 229

Descriptive statistics

Standard deviation: 131.7622264
Coefficient of variation (CV): 1.167057803
Kurtosis: -1.007092224
Mean: 112.9012
Median Absolute Deviation (MAD): 44
Skewness: 0.75940929
Sum: 2258024
Variance: 17361.2843
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 7176 | 35.9%
365 | 510 | 2.5%
364 | 225 | 1.1%
1 | 181 | 0.9%
5 | 136 | 0.7%
89 | 124 | 0.6%
179 | 121 | 0.6%
2 | 119 | 0.6%
3 | 118 | 0.6%
4 | 110 | 0.5%
Other values (356) | 11180 | 55.9%

Value | Count | Frequency (%)
0 | 7176 | 35.9%
1 | 181 | 0.9%
2 | 119 | 0.6%
3 | 118 | 0.6%
4 | 110 | 0.5%

Value | Count | Frequency (%)
365 | 510 | 2.5%
364 | 225 | 1.1%
363 | 98 | 0.5%
362 | 69 | 0.3%
361 | 48 | 0.2%

Interactions

[100 interaction plots; images not included]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
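
That calculation can be sketched in plain Python as follows. This is an illustration only, assuming two equal-length numeric sequences; the function name `pearson_r` and the sample data are hypothetical and not taken from pandas-profiling.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]           # exactly linear in x
shifted = [3.0 * v + 7.0 for v in y]     # change of location and scale

print(pearson_r(x, y))        # close to 1 for a perfectly linear relation
print(pearson_r(x, shifted))  # unchanged by the shift and rescaling
```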

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
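
A minimal sketch of this rank-then-correlate procedure in plain Python. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the sample data are hypothetical, and giving tied values their mean rank is an assumption of this sketch.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman_rho(x, y):
    """Pearson's r applied to the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]    # monotonic but nonlinear

print(spearman_rho(x, y))  # close to 1: the relation is perfectly monotonic
print(pearson_r(x, y))     # below 1: the relation is not linear
```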

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
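
The pair-counting definition can be sketched directly in Python. This illustrates the simple formula described here (often called tau-a); the function name and data are hypothetical, and library implementations such as `scipy.stats.kendalltau` default to a tie-adjusted variant (tau-b), so results can differ when ties are present.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # A pair is concordant when x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]   # one swapped pair out of six

print(kendall_tau_a(x, y))  # (5 - 1) / 6
```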

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
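
A minimal sketch of a bias-corrected Cramér's V, following the correction commonly attributed to Bergsma (2013). It assumes a contingency table given as a list of rows with no empty row or column, and uses the chi-squared statistic without continuity correction. The function name and example tables are hypothetical; this is not necessarily the exact code pandas-profiling runs.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k contingency table of counts."""
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]
    n = sum(row_totals)

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(r_corr - 1, k_corr - 1))

independent = [[20, 30], [40, 60]]   # rows are proportional to each other
associated = [[50, 0], [0, 50]]      # each row maps to exactly one column

print(cramers_v_corrected(independent))  # 0.0
print(cramers_v_corrected(associated))   # close to 1
```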

Missing values

[4 missing-value plots; images not included]

Sample

First rows

id name host_id host_name neighbourhood_group neighbourhood latitude longitude room_type price minimum_nights number_of_reviews last_review reviews_per_month calculated_host_listings_count availability_365
09138664Private Lg Room 15 min to Manhattan47594947IrisQueensSunnyside40.74271-73.92493Private room74262019-05-260.1315
131444015TIME SQUARE CHARMING ONE BED IN HELL'S KITCHEN,NYC8523790JohlexManhattanHell's Kitchen40.76682-73.98878Entire home/apt17030NaNNaN1188
28741020Voted #1 Location Quintessential 1BR W Village Apt45854238JohnManhattanWest Village40.73631-74.00611Entire home/apt2453512018-09-191.1210
334602077Spacious 1 bedroom apartment 15min from Manhattan261055465ReganQueensAstoria40.76424-73.92351Entire home/apt125312019-05-240.65113
423203149Big beautiful bedroom in huge Bushwick apartment143460MeganBrooklynBushwick40.69839-73.92044Private room65282019-06-230.5228
54402805LRG 2br BKLYN APT CLOSE TO TRAINS AND PARK22807362JennyBrooklynProspect-Lefferts Gardens40.66025-73.96270Entire home/apt120332018-08-280.05116
630070126✩Prime Renovated 1/1 Apartment in Upper East Side✩4968673SeanManhattanUpper East Side40.76831-73.95929Entire home/apt200522019-05-260.68171
734231172Fully renovated brick house floor in Brooklyn59642348KevinBrooklynSunset Park40.64550-74.01262Entire home/apt95192019-07-089.001106
85856760Renovated 1BR in exciting, convenient area29408349ChadManhattanChinatown40.71490-73.99976Entire home/apt179572017-04-180.1410
97929441Beautiful Loft w/ Waterfront View!1453898AnthonyBrooklynWilliamsburg40.71268-73.96676Private room10522322019-06-195.00364

Last rows

id name host_id host_name neighbourhood_group neighbourhood latitude longitude room_type price minimum_nights number_of_reviews last_review reviews_per_month calculated_host_listings_count availability_365
199905192459Quiet Room in 4BR UWS Brownstone10677483GregManhattanUpper West Side40.80173-73.96625Private room7010NaNNaN10
199911327940Huge Gorgeous Park View Apartment!3290436HadarBrooklynFlatbush40.65335-73.96257Entire home/apt1203132016-08-270.282327
1999223612681Shared Room 1 Stop from Manhattan on the F Train55724558TaylorQueensLong Island City40.76006-73.94080Private room55422019-06-010.65589
1999334485745Midtown Manhattan Stunner - Private room261632622RoyaltonManhattanTheater District40.75491-73.98507Private room100132019-06-163.009318
1999425616250Stylish, spacious, private 1BR apt in Ditmas Park125396920AdamBrooklynFlatbush40.64314-73.95705Entire home/apt753102019-01-030.8410
199957094539Tranquil haven in bubbly Brooklyn2052211AdrianaBrooklynWindsor Terrace40.65360-73.97546Entire home/apt1431422016-08-270.04110
199964424261Large 1 BR with backyard on UWS3447311SarahManhattanUpper West Side40.80188-73.96808Entire home/apt2002222019-05-210.5010
199974545882Amazing studio/Loft with a backyard23569951KavehManhattanUpper East Side40.78110-73.94567Entire home/apt2203282019-05-230.501293
1999826518547U2 comfortable double bed sleeps 2 guests295128Carol GloriaBronxClason Point40.81225-73.85502Private room80142019-07-011.487365
1999933631782Private Bedroom in Williamsburg Apt!8569221AndiBrooklynWilliamsburg40.71829-73.95819Private room109332019-04-281.07297